(57) ABSTRACT

A system and method for searching an associative memory
using input key values and first and second hashing sections.
Key values (Kn) can be hashed in the first hashing section
(102) to generate first output values H1(Kn) that access a
first store (104). The first store or memory portion (104) can
include "leaf pointer" entries (106-2) and "chunk pointer"
entries (106-3). A leaf pointer entry (106-2) points at data
associated with an applied key value. A chunk pointer entry
(106-3) includes pointer data. If a chunk pointer entry
(106-3) is accessed, the key value (Kn) is hashed in the
second hashing section (108) to generate second output
values H2(Kn) that access a second store or memory portion
(110). Second hashing section (108) hashes key values (Kn)
according to selection data SEL stored in a chunk pointer
entry (106-3). The system may also include a first memory
portion accessed according to address values from the first
hashing section and a second memory portion accessed
according to address values that include outputs from the
second hash section and a chunk base address value. The
hash based associative system allows for the selection of a
second hash function that has been precomputed at table
build time to be perfect with respect to a small set of
colliding key values, provides a deterministic search time
independent of the number of table entries or width of the
search key, and allows for pipelining to achieve highest
search throughput.

24 Claims, 7 Drawing Sheets 
SYSTEM AND METHOD FOR SEARCHING
AN ASSOCIATIVE MEMORY UTILIZING
FIRST AND SECOND HASH FUNCTIONS

TECHNICAL FIELD

The present invention relates generally to associative
memory systems and more particularly to associative
memory systems that use hash functions.

BACKGROUND OF THE INVENTION

Associative memory systems can typically receive a first
set of data values ("keys") as inputs. Each key maps to an
associated data value of a second set ("associated data").
Keys and their associated data form a database. The database
can then be searched by applying a key value to the
associative memory system.

Associative memory systems have a variety of applications.
Some applications may be optimized to accommodate
large data structures, others may be optimized for transaction
accuracy or reliability, still others may be optimized for
search speed or update speed.

A content addressable memory (CAM) is one type of
associative memory. While CAMs can provide relatively
fast search speeds, CAMs also have a relatively high
component cost. Therefore we seek to achieve high associative
memory throughput using denser, less expensive random
access memories (RAMs).

One way to provide fast access times is to form an
associative memory system in which a RAM memory location
is provided for every possible input key. One example
of such a system is shown in FIG. 7. The system of FIG. 7
can receive input key values having "n" bits. Three key
values are shown as K1, K2 and K3. Input key values can be
applied to a memory 700 that includes 2^n entries.
Consequently, for each possible input key value, there is a
corresponding memory 700 entry. In the particular arrangement
of FIG. 7, a memory 700 is a random access memory,
and key values can be applied to the memory 700 as
addresses. Three entries corresponding to the key values K1,
K2 and K3 are shown. Each entry is accessed by an address
that is a key value, and stores data associated with the key
value. For example, the application of key value K1 results
in the associated data value DATA Z being provided by
memory 700.

A system with direct mapping can be feasible when the
number of possible input key values is small, as for example
when the key is a binary number only a few bits wide.
However, for wider key values (larger key domain), direct
mapping is impractical, as the resulting memory size
becomes undesirably large. Further, in most applications, a
system stores only a tiny fraction of all possible key value
permutations. In such a case, a direct mapping approach
results in inefficient use of memory.

For larger key domains, hashing is another conventional
approach. A hash function translates values in one address
space to values in a smaller address space. For example, if
a system received 128-bit key values, such key values could
be translated by a hash function into a set of 16-bit hash
bucket addresses.

"Collisions" present the major practical challenge in using
hash functions for associative data systems. In our 128-bit
key example, if a hash function h(x): {0,1}^128 -> {0,1}^16 maps
128-bit keys to 16-bit hash bucket indices, a simple counting
argument shows that many different possible 128-bit keys
must hash to each of the 64K different addressable locations
("buckets"). If the keys stored in the associative memory
system include multiple keys that hash to the same bucket b,
then when an input search key hashes to bucket b, some
further "collision resolution" method is required to determine
which of the keys stored in bucket b, if any, matches
the search key. Further, even if a bucket b holds only one
key, many other possible keys ("aliases") hash to the same
bucket. Therefore even when a single candidate search result
is found, the key stored in the table must be compared
against the input search key to resolve such aliases.

Mathematics has proven that there does exist, for any
particular static set of keys and any table size larger than the
number of keys, one or more "perfect" hash functions for
which no two keys in the set collide. However, mathematical
results have also shown that for large key sets (thousands to
millions of keys), the computational complexity of finding
such perfect hash functions is extremely high; and further,
the storage complexity of describing a hash function that has
been found is also high. These results make perfect hashing
impractical for large, dynamic data sets.

A number of conventional approaches have been proposed
for addressing hash collisions. One possible approach
would be to select a new hashing function, and then
re-translate the entire current data structure into a new data
structure without a collision. Such an approach is undesirable
as it can consume considerable time and consume
considerable computing resources.

Other conventional approaches for addressing hash function
collisions include using a "linked-list." A linked list can
access a number of memory entries in series. An example of
a system having a linked-list is shown in FIG. 8.

A key value K21 is applied to a hash function 800. The
output of hash function 800 is an address to a memory 802.
In FIG. 8, three different table entries (for keys K01, K97
and K21) map to the same memory location or hash bucket.
Thus, the address for one entry 804 is shown as
(H(K01)=H(K97)=H(K21)). The entry 804 includes one of the key
values K01 and its associated data. Further, the entry 804 is
linked with a linked-list "next" pointer 806 to a second entry
808 that includes the key value K97 and its associated data.
Entry 808 is linked with a linked-list "next" pointer to a third
entry 810 having the key value K21 and its associated data.
The "next" pointer of this third entry is null, indicating that
there are no more entries in the list.

In the arrangement of FIG. 8, when the key value K21 is
applied, hash function 800 accesses entry 804. The applied
key value K21 is compared to the stored key value K01.
Because the key values are different, the next entry 808 at
the linked-list pointer 806 is accessed. The applied key value
K21 is compared once again to the stored key value K97.
Because the key values are again different, accesses continue
according to the linked list pointer 806. Entry 810 is then
accessed. The applied key value K21 is once again compared
to the stored key value K21. Because the key values
are the same, the corresponding associated data DATA can
be provided as an output value.

A drawback to the above-described arrangement is that
multiple memory accesses and compare operations may
be required, up to the length of the longest linked-list in the
table in a worst case search. The length of the longest linked
list depends on the table contents and can grow large.

Another conventional approach for addressing hashing
function collisions includes a search tree. In one particular
case, a search tree uses a number of search criteria to arrive
at the desired associated data. An example of a collision
resolution system having a binary search tree is shown in FIG. 9.

The example of FIG. 9 includes some of the same general
items as FIG. 8. A key value K31 is applied to a hash
function 900. The output of hash function 900 is an address
to a memory 902. In FIG. 9, four different key values (K62,
K45, K72 and K31) hash to the same memory entry. Thus,
the address for one entry 904 is shown as
(H(K62)=H(K45)=H(K72)=H(K31)). The entry 904 can activate a binary
search operation to select among the data associated with the
four possible key values (K62, K45, K72 and K31). As just
one example, a particular pointer value SEARCH can be
stored in entry 904. The output of this value SEARCH can
cause a particular binary tree search to be performed.

One particular binary search arrangement is illustrated by
search steps 906-1 to 906-3d. In search step 906-1, the
applied key value is compared to a predetermined value to
select two of the four possible key values. Search steps
906-2a and 906-2b can select one key value from two.
Search steps 906-3a and 906-3b can provide the data
associated with a particular key value at the leaf level. In FIG.
9, data values DATA I, DATA J, DATA K and DATA L are
associated with key values K31, K45, K62 and K72,
respectively. At the selected leaf of the binary tree search a
compare against a stored key value is performed, to resolve
aliasing.

A drawback to the above-described arrangement is that
the various search steps add to an access time. In particular,
a binary search among "m" different values can require
log2(m) search and compare steps. The number of collisions
that occur at each table location is dependent on the contents
of the table. For randomly distributed hash function output,
the number of collisions per location tends to be relatively
small, but there is a certain probability of encountering a
larger number of collisions, which would result in longer
search time. This property makes it impossible to set a tight
upper bound or worst-case on the number of search steps. In
many real-time applications, deterministic performance is
required. Further, for maximum throughput performance, it
is desirable to fully pipeline a search algorithm such that
each discrete step can be executed by separate dedicated
hardware. Without a deterministic number of steps, an
algorithm cannot be fully pipelined in this way.

It would be desirable to arrive at some way of providing
an associative data system that can have the memory size
advantages of search systems using hash functions, but not
suffer from indeterminate access times that may arise from
hash collisions. Such a system, to be practical, must also
permit efficient update of table contents, without the large
pre-processing times required for perfect hash functions of
large key sets.

SUMMARY OF THE INVENTION

According to one embodiment, an associative data system
can receive input key values. A first hashing function maps
the input key values into first output values. The number of
first output values is smaller than the number of all possible
key values. When a set of different table key values collides
at the same first output value, a second small perfect hash
function maps the set of colliding key values to second
output values. Thus, essentially all key searches can be
accomplished in two accesses.

According to another embodiment, a first hash function
maps input key values to first memory locations. When
multiple keys map to the same first memory location, the
first memory location is a "chunk pointer" entry that provides
second hash function parameters. Second hash function
parameters are used to generate a second perfect hash
value that selects between the multiple key values that
collide at a particular first memory location.

According to one aspect of the above embodiment, a
pointer entry can include a chunk base address within a
second memory. The chunk base address can be combined
with second outputs generated by the second hash function
to generate a second memory address. The second memory
address stores a pointer to a location in a third memory,
where a key value and associated data corresponding to the
key value are stored.

According to another aspect of the above embodiment,
when one input key maps to a first memory location at which
there is no collision, the first memory location points to an
entry in a third memory that includes a key value and
associated data corresponding to a table key.

According to another aspect of the above embodiment,
when no input keys map to a first memory location, the first
memory location can be a null entry that includes data
indicating that there is no stored data that is associated with
the corresponding input key, or with any input key that
hashes to that address.

According to one aspect of the above embodiments, a
second hashing function can map key values into a second
output space. The size of the second output space can be
selectable.

According to another aspect of the embodiments, a first
hashing function can include multiplication and division
(modulo operation).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a first embodiment of an
associative memory system consistent with the principles of
the invention.

FIG. 2 is a flow diagram of a search operation according
to an embodiment of the present invention.

FIG. 3 is a flow diagram of a method for adding associative
data according to an embodiment of the present
invention.

FIG. 4 is a block diagram of a second embodiment of an
associative memory system consistent with the principles of
the invention.

FIG. 5 is a diagram of a second hashing function
calculator according to one embodiment.

FIG. 6 is a diagram of a GF[2] multiplier circuit.

FIG. 7 is a diagram of an associative memory system
according to a first conventional example.

FIG. 8 is a diagram of an associative memory system
according to a second conventional example.

FIG. 9 is a diagram of an associative memory system
according to a third conventional example.

DETAILED DESCRIPTION OF THE
EMBODIMENTS

Various embodiments of the present invention will now be
described in conjunction with a number of diagrams. The
first embodiment is directed to a system that can store data
values which correspond to input key values. A method for
operating the associative memory is also described.

Referring now to FIG. 1, a first embodiment of an
associative memory system consistent with the principles of
the invention is set forth in a block diagram and designated
by the general reference character 100. System 100 includes
a first hashing section 102 and a first store 104. Input keys
(Kn) are received by the first hashing section 102. The first
hashing section 102 performs a first hash function on input
key values (Kn) to generate first output values H1(Kn). First
output values (H1(Kn)) are used to access locations in first
store 104.

In the particular arrangement of FIG. 1, key values Kn are 
128-bit binary numbers, but other data widths are also 
possible. The first hashing section 102 maps key values to 
locations within the first store 104. The number of entries in 
first store 104 — which number is the size of the output range 
provided by hashing section 102 — will generally be selected 
to be larger than the number of keys to be stored in the 
system. Still, the distribution of hashed keys into hash 
buckets is not perfectly balanced in general, so that some 
locations in first store 104 will be empty (0 keys), some will 
correspond to exactly 1 key, and some will correspond to 
multiple keys ("collisions"). 
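
How the three cases split can be estimated with a short calculation. The sketch below assumes an idealized hash that spreads keys uniformly and independently over the buckets, and uses the Poisson approximation to bucket occupancy. The function name and the example load are illustrative assumptions, not figures from the embodiment.

```python
import math


def bucket_fractions(num_keys: int, num_buckets: int):
    """Estimate the fraction of buckets holding 0, exactly 1, or 2+ keys.

    Assumes uniformly random hashing (an idealization) and the Poisson
    approximation with mean load = num_keys / num_buckets.
    """
    load = num_keys / num_buckets
    empty = math.exp(-load)            # P(bucket holds 0 keys)
    single = load * math.exp(-load)    # P(bucket holds exactly 1 key)
    multiple = 1.0 - empty - single    # P(bucket holds a collision)
    return empty, single, multiple
```

Under this model, a table with twice as many buckets as keys has roughly 61% null entries, 30% single-key entries, and 9% entries holding collisions.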

Based on these three cases, each word stored within first 
store 104 can take one of three forms. A location can contain
a "null" entry (one of which is shown as 106-1), a "leaf 
pointer" entry (one of which is shown as 106-2), or a "chunk 
pointer" entry (one of which is shown as 106-3). A null entry 
106-1 can indicate that the system 100 holds no key that 
hashes to the same bucket as an applied key value — hence, 
that it cannot contain the key in question. A leaf pointer entry 
106-2 can indicate that the system 100 holds exactly one key 
that hashes to the same bucket as the applied key value. 

A chunk pointer entry 106-3 can indicate that more than 
one key has mapped to the corresponding 106-3 location. 
Key values that result in the same hashing output value will
be referred to herein as "colliding" key values. When a
chunk pointer entry 106-3 is accessed by a key value,
information retrieved from the chunk pointer location 106-3
is used as a parameter to a second hash function. The second
hash function selects a memory location corresponding to
the applied key value. The memory location is selected from 
multiple memory locations that store the colliding key 
values. 

System 100 thus further includes a second hashing section 
108 and a second store 110. If an input key (Kn) accesses a
chunk pointer entry 106-3 in first store 104, the second
hashing section 108 performs a second hash function on the
colliding input key value to generate a second output value
H2(Kn, SEL). A second hash function can be parameterized
according to information provided by the corresponding
chunk pointer entry 106-3. Parameter information is shown
as P,Q in FIG. 1. This can allow for the selection of a second
hash function that has been precomputed at table build time
to be perfect with respect to the small set of colliding key 
values. 

Locations within second store 110 are arranged in 
"chunks." One exemplary chunk is shown in FIG. 1 as item 
112. A chunk 112 corresponds to the output range of a given 
second hash function. The set of colliding key values is 
stored within this space. In the particular example of FIG. 1, 
chunk 112 includes entries for three colliding key values, 
shown as COLLIDE KEY1, COLLIDE KEY2 and COLLIDE KEY3.

In one particular arrangement, a chunk pointer entry 
106-3 can provide a base address value for a chunk, chBase.
Thus, this base value is combined with a second output value
from second hashing section 108 to generate a particular
second store 110 location.

According to one particular arrangement, a first hash
function implemented by first hashing section 102 includes
operations in a finite Galois field GF[2^n]. More particularly,
the first hash function can include multiplication and modulo
function in the ring GF[2][x]. Even more particularly, the
first hashing function can be given by the equation set forth
below. In the preferred embodiment, the *, + and mod
functions are interpreted as operations in the ring of
polynomials over GF[2], and the input operands are interpreted
as polynomials over GF[2]. This form of arithmetic is well
known in the art, as it is used in standard cyclic-redundancy
check (CRC) coding, as well as other types of error-control
codes and hash functions, among other applications.

H1(K) = ((K * A + B) mod P) mod Q

The value K is a binary key value. The value A is a
constant. The value B is also a constant, and in some
embodiments is zero. P is an irreducible polynomial of
degree n over GF[2], where n is the number of bits in the
binary key value K, in this case 128. Q is an irreducible
polynomial of degree m over GF[2], where m is less
than n, and 2^m is the number of entries in a first memory. For
this embodiment the number of entries addressed by the first
hash function must therefore be a power of two.

The first hashing function can be implemented as a 
Boolean expression, rather than utilizing multiplication and 
division (modulo) circuits. Since the multiplicand A, the 
addend B, and the moduli P and Q are constants, each bit of
the output may be expressed as an exclusive-OR (XOR) of
a certain number of input bits to compute the function H1 as
a fixed-function circuit. 
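
The arithmetic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the circuit of the preferred embodiment: bit i of an integer stands for the coefficient of x^i, and the 8-bit key width and the constants A, B, P and Q are hypothetical values chosen only to keep the example small (the embodiment uses 128-bit keys).

```python
def clmul(a: int, b: int) -> int:
    """Multiply two polynomials over GF[2] (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition over GF[2] is XOR
        a <<= 1
        b >>= 1
    return result


def polymod(a: int, m: int) -> int:
    """Remainder of polynomial a divided by polynomial m over GF[2]."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)   # cancel the leading term
    return a


# Hypothetical small parameters (assumptions for illustration only).
P = 0b100011011   # x^8 + x^4 + x^3 + x + 1, irreducible, degree n = 8
Q = 0b10011       # x^4 + x + 1, irreducible, degree m = 4, so 2^4 entries
A = 0b01010111    # arbitrary constant multiplicand
B = 0b00000101    # arbitrary constant addend (may be zero)


def h1(key: int) -> int:
    """H1(K) = ((K * A + B) mod P) mod Q, all operations over GF[2]."""
    return polymod(polymod(clmul(key, A) ^ B, P), Q)
```

Because every step is linear over GF[2], `h1(k1 ^ k2) ^ h1(0)` equals `h1(k1) ^ h1(k2)` for any two keys. That property is what allows each output bit to be written as an XOR of selected input bits.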

According to one particular arrangement, a second hashing
function implemented by second hashing section 108
can include operations in the ring of polynomials over
GF[2]. More particularly, the second hashing function can
include multiplication and modulo operation. Even more
particularly, the second hashing function can be given by the
equation set forth below.

The value K is a binary key value. The value p is a
selectable parameter retrieved in the first hash function
access. P is an irreducible polynomial of degree n over
GF[2], where n is the number of bits in the binary key value
K. The second modulus is an irreducible polynomial over
GF[2]. The degree of the second modulus is selectable
according to the parameter q, also retrieved in the first hash
function access. In one very particular arrangement, the
value p can range from 0 to 1023. The degree of the second
modulus is selectable from 1 to 8.

By making the degree of the second modulus selectable, the
resulting output range of a second hashing function is
selectable. Consequently, in an arrangement such as that of
FIG. 1, chunk sizes are selectable. This may have a number of
advantages. Smaller chunk sizes may result in more efficient
use of a second store. Larger chunk sizes may lower the
probability of an undesirable double "collision" during the
creation of an associative data structure. A double collision
can result when a key value collides at a first store location,
generated by a first hash function, and then also collides with
another key value at a second store location, generated by a
second hash function. Still further, such an arrangement can
allow a set of colliding values to be rehashed from a smaller
chunk into a larger chunk. Such an operation may be
advantageous when one or more added key values map to a
chunk that already includes data associated with other key
values. Rehashing the smaller chunk can make room for the
new key value data and/or decrease the probability of a
double collision in the chunk.

By making the value p selectable, various second hash 
functions may be used in the second hashing section 108. 
This can enable the second hash function to be switched to 
more optimally hash a set of colliding key values. 
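
A minimal sketch of a parameterized second hash family follows. It assumes a form analogous to the first hash function, ((K * p) mod P) mod Q_q, where Q_q is an irreducible polynomial of degree q. That exact form, the name Q_q, and the table of moduli are assumptions made for illustration. The degree-128 modulus used for P is the commonly used polynomial x^128 + x^7 + x^2 + x + 1.

```python
def clmul(a: int, b: int) -> int:
    """Multiply two polynomials over GF[2] (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def polymod(a: int, m: int) -> int:
    """Remainder of polynomial a divided by polynomial m over GF[2]."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


P = (1 << 128) | 0x87   # x^128 + x^7 + x^2 + x + 1 (degree n = 128)

# Hypothetical table: one irreducible modulus per selectable degree q.
Q_BY_DEGREE = {
    1: 0b11,          # x + 1
    2: 0b111,         # x^2 + x + 1
    3: 0b1011,        # x^3 + x + 1
    4: 0b10011,       # x^4 + x + 1
    5: 0b100101,      # x^5 + x^2 + 1
    6: 0b1000011,     # x^6 + x + 1
    7: 0b10000011,    # x^7 + x + 1
    8: 0b100011011,   # x^8 + x^4 + x^3 + x + 1
}


def h2(key: int, p: int, q: int) -> int:
    """Assumed form: ((K * p) mod P) mod Q_q over GF[2]."""
    return polymod(polymod(clmul(key, p), P), Q_BY_DEGREE[q])


def chunk_size(q: int) -> int:
    """Output range of h2 for a given q, i.e. the number of chunk slots."""
    return 1 << q
```

Changing p picks a different member of the family for the same chunk size, while changing q changes the chunk size between 2 and 256 slots.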

A third store 114 stores one record for each key in the 
table. The record contains a copy of the key, to be used for 
resolving aliases, and the associated data to be returned in a 
successful find of that key. The third store 114 is accessed 
whenever either the first or the second store returns a leaf 
pointer. 

Referring now to FIG. 2, a search operation according to
an embodiment consistent with the principles of the invention
is set forth in a flow diagram. The search operation is
designated by the general reference character 200. The
search operation 200 includes a first hashing step 202. A first
hashing step 202 hashes an input key value according to a
first hashing function H1(Kn). A resulting hash output value
is used to access a first store (step 204).

How a search operation continues can depend upon the
type of data accessed, such as a chunk pointer or a null value
within the first store. In the event the data indicates a null
value (determined in step 208), a "no_match" value is
generated (step 210). This indicates that the search was not
successful. In the event the data indicates a leaf pointer (i.e.,
not a chunk pointer (determined in step 206) and not a null
value (determined in step 208)), the third store 114 is
accessed (step 212) to retrieve the stored key value, which
may be an alias of the input search key. An alias compare
is then performed and the stored key value is tested against
the input search key (step 214): if they match, associated
data can be output (step 216); if not, "no_match" is output
(step 210) based on the failed alias test.

It is noted that a null value is a single access operation, 
and a leaf pointer returned by the first access is a single 
pointer access plus single compare operation. This can be 
conceptualized by first access indicator 218 in FIG. 2. 

In the event data accessed within first store 104 is a chunk 
pointer value (determined in step 206), chunk pointer data is 
used to drive subsequent steps (step 220). A search operation 
then continues with a second hashing step 222. The second 
hashing step 222 hashes the input key value according to a 
second hash function H2(Kn, p, q). The second hash function 
can be parameterized according to pointer data retrieved in 
the first memory access. 

A resulting second hashing step output value is used, 
along with chunk base pointer data from the first access, to 
access a second store (step 224). The result of this second
access is either a null pointer, or a leaf pointer (determined
in step 208). In the case of a leaf pointer, third store 114 is
accessed (step 212), an alias compare is done (step 214), and
if there is a match (the address or key is not an alias), then
the associated data fetched from the third memory can be
output (step 216). 

It is noted that an access to the second store can be a 
double access plus single compare operation. This can be 
conceptualized by first and second access indicators 218 and 
226. All key values can be searched in two steps plus one
compare. Unlike conventional hashing approaches, this 
number of search steps is both deterministic and small. 
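
The search flow of FIG. 2 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the class names, the list-based stores, and the two toy hash functions are assumptions, and the toy hashes are simple integer stand-ins rather than the GF[2] functions of the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple, Union


@dataclass
class Leaf:
    record: int      # index of a key/data record in the third store


@dataclass
class Chunk:
    p: int           # second-hash parameter
    q: int           # selects the chunk size, 2**q slots
    ch_base: int     # base address of the chunk in the second store


Entry = Union[None, Leaf, Chunk]   # None models a null entry


def search(key: int,
           h1: Callable[[int], int],
           h2: Callable[[int, int, int], int],
           first_store: List[Entry],
           second_store: List[Optional[Leaf]],
           third_store: List[Tuple[int, str]]) -> Optional[str]:
    entry = first_store[h1(key)]                    # first access
    if isinstance(entry, Chunk):
        slot = entry.ch_base + h2(key, entry.p, entry.q)
        entry = second_store[slot]                  # second access
    if entry is None:
        return None                                 # "no_match"
    stored_key, data = third_store[entry.record]    # fetch the record
    return data if stored_key == key else None      # alias compare


# Toy stand-in hashes (assumptions, NOT the GF[2] hashes described above).
def toy_h1(key: int) -> int:
    return key % 4


def toy_h2(key: int, p: int, q: int) -> int:
    return ((key // 4) * p) % (1 << q)


# Hand-built table: key 5 is alone in its bucket; keys 2 and 6 collide.
third = [(5, "DATA5"), (2, "DATA2"), (6, "DATA6")]
second = [Leaf(1), Leaf(2)]
first = [None, Leaf(0), Chunk(p=1, q=1, ch_base=0), None]
```

With this table, `search(6, toy_h1, toy_h2, first, second, third)` follows the chunk pointer and returns `"DATA6"`, while key 10 lands on the slot of key 2 and is rejected by the alias compare.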

The above search description has assumed a precomputed
search data structure stored in the three memories. We now
describe how this data structure can be built and how entries
may be added or deleted dynamically. Referring now to FIG.
3, a method for adding associative data is set forth in a flow
diagram and designated by the general reference character
300. Key and associated data are stored in third store 114 and
pointer data to the locations are saved (step 302).

The data adding operation method 300 includes a first
hashing step 304. A first hashing step 304 hashes an input
key value according to a first hash function H1(Kn). A
resulting hash output value is used to access a first store (step
306).

How an adding operation continues can depend upon the
type of data accessed within the first store (determined in
step 308). In the event the data indicates a null value, a
pointer to the new key value and its associated data is
written into the first store location (step 310).

In the event the data indicates a leaf pointer value or
chunk pointer value (determined in step 308), the adding
operation 300 continues with a second hashing loop 312 to
find suitable parameter values. A second hashing step 314
hashes all colliding values with a second hash function
H2(Kn, p, q) for particular values of p, q. The second hash
function H2(Kn, p, q) is performed according to one set of
many possible selection criteria.

The resulting second hashing function output values are
checked for collisions in the second store (step 316). That is,
if the colliding key values from the first hash operation hash
into different hash output values using the second hash
operation with the candidate p, q parameters, no collision
exists. However, if two or more of the key values hash to the
same output value using the candidate p, q parameters, a
collision has occurred.

As shown in FIG. 3, in the event of a collision in the
second store (determined in step 316), new second hashing
function parameters are selected (step 318). The method can
then return to step 314, to "re-hash" the colliding key values
according to a new second hashing function.

In the event no collision exists in the second store
(determined in step 316), a new chunk is built in a second
store. In particular, pointers to key values and their
associated data are written in chunk locations according to their
corresponding second hashing function output values (step
320). Pointer information for the newly formed chunk is
then written into the corresponding pointer (or former leaf)
location in the first store (step 322), thus completing the add 
of the new key to the system. 
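
The parameter search of loop 312 and the chunk build of step 320 can be sketched as below. This is a sketch under assumptions: the function names, the search order (smallest chunk first), and the stand-in multiplicative hash are all hypothetical, and the stand-in hash is not the GF[2] second hash function of the embodiment. The second hash is passed in as a callable so that any parameterized family can be substituted.

```python
from typing import Callable, List, Optional, Sequence, Tuple

HashFn = Callable[[int, int, int], int]


def find_perfect_params(keys: Sequence[int], h2: HashFn,
                        max_p: int = 1024,
                        max_q: int = 8) -> Optional[Tuple[int, int]]:
    """Try candidate (p, q) until h2 maps all keys to distinct slots."""
    q_min = max(1, (len(keys) - 1).bit_length())   # need 2**q >= len(keys)
    for q in range(q_min, max_q + 1):              # smallest chunk first
        for p in range(max_p):
            slots = {h2(key, p, q) for key in keys}
            if len(slots) == len(keys):            # no collision (step 316)
                return p, q
    return None                                    # no perfect function found


def build_chunk(records: Sequence[Tuple[int, int]], h2: HashFn,
                p: int, q: int) -> List[Optional[int]]:
    """Write each (pointer, key) record at its second-hash slot (step 320)."""
    chunk: List[Optional[int]] = [None] * (1 << q)
    for pointer, key in records:
        chunk[h2(key, p, q)] = pointer
    return chunk


def standin_h2(key: int, p: int, q: int) -> int:
    """Stand-in 32-bit multiplicative hash (an assumption, for illustration)."""
    return ((key * (2 * p + 1)) & 0xFFFFFFFF) >> (32 - q)
```

Once a collision-free (p, q) pair is found, the pair and the chunk base address are what would be written into the chunk pointer entry of the first store in step 322.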

Referring now to FIG. 4, a second embodiment of an
associative memory system consistent with the principles of
the invention is set forth in a block diagram and designated
by the general reference character 400. The system 400
includes a processing system 402 and a memory system 404
commonly coupled to an address bus 406 and a data bus 408.
Processing system 402 includes a first hash function
calculator 410, a second hash function calculator 412, and an
adder 414.

A first hash function calculator 410 receives key values
from data bus 408, and performs a first hash function on such
values. The first hash function results can be applied to the
memory system 404 on address bus 406. A second hash
function calculator 412 receives a key value, function
parameter value (SEL), and a chunk base value (chBase). A
particular second hash function, from a family of many
possible second hash functions, can be selected according to
the SEL values. The selected second hash function hashes a
key value. The resulting second hashing function output
value is added to the chBase value to generate a second
memory address. The second memory address can be placed
on address bus 406.

Processing system 402 can include a number of
structures, such as a general-purpose processor device that
includes registers and arithmetic/logic circuits, and executes
a series of instructions to calculate a first and/or second
hashing function. Alternatively, such a processor system
may be implemented all, or in part, as a portion of an
application specific integrated circuit (ASIC). Of course, the
above-mentioned examples represent but two possible
implementations.

The memory system 404 can include a first memory
portion 416 and a second memory portion 418. First memory
portion 416 can include a number of entries that can be
accessed according to address values generated by the first
hash function calculator 410. Second memory portion 418
can include a number of entries that can be accessed
according to address values that include outputs from the
second hash function calculator 412.

In one particular embodiment, first memory portion 416
can be a random access memory (RAM). Second memory
portion 418 can also be a RAM. Further, the first and second
memory portions may be different sections of the same
RAM device.

The first memory portion 416 can include leaf entries with
key and associated data information like 420-1, chunk
pointer entries like 420-2, and null entries like 420-3.
Second memory portion 418 can include a chunk 422
consisting of key and data information for several keys and
possibly null entries for empty spaces in the chunk. Note that
this embodiment has no third memory portion, as the key
and associated data information are stored directly in the
first and second memory portions.

In the particular arrangement of FIG. 4, leaf entry 420-1
has an address H1(K1) and stores associated data DATA_K1.
Chunk pointer entry 420-2 has an address H1(K2)=H1(K3)
and stores the chBase and SEL values for the chunk. Null
entry 420-3 has an address H1(K4) and stores associated
data NULL.

Within second memory portion 418, chunk 422 occupies
a series of consecutive entries. Three entries of the chunk
422 are detailed. Entry 424-1 is the first entry of the chunk
and has an address (chBase+0) and stores associated data
NULL. Entry 424-2 has an address (chBase+H2(K2,SEL))
and stores associated data DATA_K2. Entry 424-3 has an
address (chBase+H2(K3,SEL)) and stores associated data
DATA_K3.

Four search operations for the second embodiment of
associative data system 400 will now be described.

In a first operation, key value K4 is applied to the system
and received by first hash function calculator 410. Key value
K4 is hashed to generate first output value H1(K4). The first
output value H1(K4) is applied as an address value to first
memory portion 416. In response to the address H1(K4), first
memory portion 416 accesses null entry 420-3 and outputs
the NULL data value on data bus 408. A NULL data value
indicates no data associated with the key K4 is stored in the
system.

In a second operation, key value K1 is applied to the
system and received by first hash function calculator 410.
The value is hashed and output as first output value H1(K1).
This value H1(K1) is applied as an address value to first
memory portion 416. In response to the address H1(K1), first
memory portion 416 accesses leaf entry 420-1 and tests the
search key against the stored key K1. In the example of FIG.
4, the compare is TRUE, so the system outputs the data
value DATA_K1 on data bus 408. In this way, the data value
associated with the K1 key value (DATA_K1) can be accessed
from the associative memory system.

In a third operation, key value K2 is applied to the system
and received by first hash function calculator 410. The
value is hashed and output as first output value H1(K2). This
value H1(K2) is applied as an address value to first memory
portion 416. In response to the address H1(K2), first memory
portion 416 accesses pointer entry 420-2 and outputs the
pointer data chBase and SEL on data bus 408. The chBase
and SEL values are applied to the second hash function
calculator 412 along with the key value K2.

The SEL data includes particular hashing function param-
eters that can select from a number of hash functions. The
key value K2 is then hashed according to a selected hash
function and output as a second output value H2(K2, SEL).
This value H2(K2, SEL) is added to the chBase data and
applied as an address value to second memory portion 418.
In response to the address chBase+H2(K2, SEL), second
memory portion 418 accesses entry 424-2 and then com-
pares the stored key and the search key. In the example of
FIG. 4, the compare is TRUE, so it outputs the data value
DATA_K2 on data bus 408. In this way, a data value associ-
ated with the K2 key value can be accessed from the
associative memory system.

In a fourth operation, key value KX is applied to the
system and received by first hash function calculator 410.
The value can be hashed and output as first output value
H1(KX), where it happens that H1(KX)=H1(K2)=H1(K3),
although the key KX is not stored in the table. The value
H1(KX) is applied as an address value to first memory
portion 416. The KX value "collides" with the K2 value.
Consequently, in response to the address H1(KX), first
memory portion 416 accesses the same pointer entry 420-2
accessed by the K2 value. The same chunk pointer data
chBase and SEL are placed on data bus 408 once again. The
chBase and SEL values are applied to the second hash
function calculator 412, this time along with the search key
value KX. The second hash of KX may point to a null chunk
entry, in which case no-match is output immediately.
Alternatively, the second hash of KX may point to a location
within the chunk where a different key is stored. (The perfect
hash constructed for the chunk applies only to those keys
actually stored in the chunk. Since KX is not in the chunk,
there is no guarantee that it does not collide with one of the
chunk keys.) In this second case, KX will be detected as an
alias at compare time and no-match will be output.

In one very particular arrangement, the second hashing
function can be the previously described hashing function.

The SEL data is a 13-bit value that comprises two fields:
a 10-bit value that corresponds to the p value, and a three-bit
value that corresponds to the q value (the degree of the
polynomial Q_q). Because p is a 10-bit value, there can be
1024 possible second hashing functions for each size q. The
q value can select polynomials of degree 1 to 8.
Consequently, each chunk includes 2 to 256 entries. Of
course, such values have been optimized for particular
applications and should not be construed as limiting the
invention thereto.

As noted in conjunction with the embodiment of FIG. 1,
a first hash function, such as

H1(K)=((K*A+B)mod P)mod Q

can be simplified, for fixed values A, B, P, Q, as a single
Boolean expression of K, rather than a collection of polyno-
mial multiplication and division circuits.

However, such an approach for the second hash functions 
may not be practical in those cases where many different 
hash functions can be selectable according to pointer data 
information. One example of a second hash function calcu- 
lator is illustrated in FIG. 5. 
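
The four search operations described for FIG. 4 can be sketched in software as follows. This is a minimal sketch under stated assumptions: the toy hash functions `h1` and `h2`, the table sizes, and the integer keys standing in for K1 to K4 and KX are hypothetical, and the toy hashes are not the Galois-field functions of the embodiment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    """Entry holding a key and its associated data directly."""
    key: int
    data: str


@dataclass
class ChunkPointer:
    """First-store entry holding the chunk base address and SEL value."""
    ch_base: int
    sel: int


def h1(key: int) -> int:
    # Hypothetical first hash: low three bits of the key.
    return key & 0b111


def h2(key: int, sel: int) -> int:
    # Hypothetical second-hash family: SEL picks which two key bits are used.
    return (key >> sel) & 0b11


def lookup(key: int, first: list, second: list) -> Optional[str]:
    entry = first[h1(key)]
    if entry is None:                      # null entry: no data stored
        return None
    if isinstance(entry, ChunkPointer):    # rehash with the stored SEL
        entry = second[entry.ch_base + h2(key, entry.sel)]
        if entry is None:                  # null chunk entry: no-match
            return None
    # Final compare rejects aliases such as KX.
    return entry.data if entry.key == key else None


K1, K2, K3, K4, KX = 1, 10, 18, 5, 42      # K2, K3 and KX collide under h1

first = [None] * 8
first[h1(K1)] = Leaf(K1, "DATA_K1")
first[h1(K2)] = ChunkPointer(ch_base=4, sel=3)

second = [None] * 8
second[4 + h2(K2, 3)] = Leaf(K2, "DATA_K2")
second[4 + h2(K3, 3)] = Leaf(K3, "DATA_K3")

for k in (K4, K1, K2, KX):
    print(k, lookup(k, first, second))
```

Every search touches at most one first-store entry and one second-store entry, which is the property that makes the search time deterministic.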
A second hash function calculator is designated by the
general reference character 500. The hash function calcula-
tor 500 can execute the function previously described.

The hash function calculator 500 includes a multiplier/mod
P calculator 502. A multiplier/mod P calculator 502 includes 
a multiplier circuit that can receive a key value and multiply 
it by a selectable value p. The value p results in the selection 
of a particular hashing function from many hashing func- 
tions. The resulting polynomial product can be subject to a 
modulo P operation. 

In one particular arrangement, the value p can be a 10-bit 
value, and the multiplier/mod P calculator 502 includes an 
n×10 Galois field GF[2] multiplier, where n is the number of
bits in a key value. 

Results from the multiplier/mod P calculator 502 are
supplied to various mod Q_i circuits, shown as 504-1 to
504-8. Each mod Q_i circuit can provide a modulo Q_i
operation, where each Q_i is an irreducible polynomial of a
different degree. In the particular arrangement of FIG. 5,
mod Q_i circuits 504-1 to 504-8 have degrees of 1 to 8,
respectively.

The output of the various mod Q_i circuits (504-1 to 504-8)
is provided to a multiplexer circuit (mux) 506. Mux 506
selects one of the various mod Q_i circuits (504-1 to 504-8)
according to an input value q. In the particular arrangement 
of FIG. 5, the q value ranges from 1 to 8. 
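
The data path of FIG. 5 can be modeled in software as a carry-less multiply followed by two polynomial reductions. The following is a sketch only: the specific polynomials chosen for `P` and `Q`, the bit layout assumed by `decode_sel`, and the omission of any additive constant are assumptions made for illustration and are not taken from the description.

```python
# Polynomials over GF(2) are held as integers: bit i is the coefficient of x^i.
# Hypothetical choices, assumed irreducible; the text does not name them.
P = 0x1100B  # x^16 + x^12 + x^3 + x + 1
Q = {
    1: 0b11, 2: 0b111, 3: 0b1011, 4: 0b10011,
    5: 0b100101, 6: 0b1000011, 7: 0b10000011, 8: 0b100011011,
}


def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2) polynomial) product of a and b."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def polymod(a: int, m: int) -> int:
    """Remainder of polynomial a divided by polynomial m over GF(2)."""
    dm = m.bit_length() - 1
    while a.bit_length() - 1 >= dm:
        a ^= m << (a.bit_length() - 1 - dm)
    return a


def decode_sel(sel: int) -> tuple:
    """Split a 13-bit SEL into (p, q); the field positions are assumed."""
    p = sel >> 3               # upper 10 bits
    q = (sel & 0b111) + 1      # 3-bit field mapped onto degrees 1..8
    return p, q


def h2(key: int, p: int, q: int) -> int:
    """Multiply by p, reduce mod P, then select the mod Q_q result."""
    t = polymod(clmul(key, p), P)                              # calculator 502
    outputs = {d: polymod(t, poly) for d, poly in Q.items()}   # circuits 504-1..504-8
    return outputs[q]                                          # mux 506


p, q = decode_sel((5 << 3) | 2)
print(p, q, h2(0xBEEF, p, q))
```

Computing all eight remainders and then selecting one mirrors the parallel circuits feeding the mux; a software version would normally compute only the selected remainder.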

In this way, a second hash function calculator 500 can 
take a form that may be advantageously implemented as a 
circuit. Such an implementation may hash key values at 
faster speeds than a general-purpose processor executing a 
series of instructions.

It is noted that mod P and mod Q operations may be 
performed by various circuits. As just one example, fixed 
exclusive-OR (XOR) trees for particular polynomial
"divide-by" functions for P and each Q_i are a well-known
way to implement such functions. 
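
The reason a fixed XOR tree suffices is that reduction modulo a fixed polynomial over GF(2) is linear: the remainder of an input is the XOR of the remainders of its individual bits. A minimal sketch, assuming the hypothetical helper names `polymod`, `build_xor_table` and `mod_by_xor` and an illustrative degree-3 polynomial:

```python
def polymod(a: int, m: int) -> int:
    """Bit-serial remainder of polynomial a divided by m over GF(2)."""
    dm = m.bit_length() - 1
    while a.bit_length() - 1 >= dm:
        a ^= m << (a.bit_length() - 1 - dm)
    return a


def build_xor_table(m: int, width: int) -> list:
    """Precompute x^i mod m for every input bit position i."""
    return [polymod(1 << i, m) for i in range(width)]


def mod_by_xor(a: int, table: list) -> int:
    """Reduce a by XORing the precomputed remainders of its set bits."""
    r = 0
    for i, rem in enumerate(table):
        if (a >> i) & 1:
            r ^= rem
    return r


Q3 = 0b1011                       # x^3 + x + 1, an illustrative choice
table = build_xor_table(Q3, 12)

# Output bit j of the tree is the XOR of the input bits listed in taps[j].
taps = [[i for i, rem in enumerate(table) if (rem >> j) & 1] for j in range(3)]
print(taps)
```

In hardware each list in `taps` corresponds to one XOR tree, so the reduction completes in combinational logic rather than in a sequence of division steps.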

Referring now to FIG. 6, a schematic diagram is set forth 
illustrating one of the number of table entries or width of the 
search key, and allows for pipelining to achieve highest
search throughput. 

The various embodiments illustrate a hash-based associa- 
tive data system that provides a deterministic search time 
independent of the number of table entries or width of search 
key (except inasmuch as the compare step requires a data 
read equal to the width of the search key). This is in contrast 
to other hashing approaches that may utilize linked lists 
and/or search trees, or other collision resolution methods. 

The system described herein has a certain probability of
"insert failure." An insert failure occurs when, on adding a key
to the table, it is discovered that for a certain set of keys that
collide in H1, among all the possible SEL parameter values,
there is none that results in a collision-free H2 output. By the
construction of the particular H2 function chosen herein,
probability theory may be used to demonstrate that the 
probability of such failure is smaller than 2.33 for key sets 
as large as 10 million keys. Other choices of H2 with similar 
properties could yield similar results. 
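
The table-build step implied here, trying SEL values until one hashes the colliding keys without collision, can be sketched as below. The function `find_sel`, the toy hash family `toy_h2`, and the candidate range are hypothetical and serve only to illustrate the search and the insert-failure case.

```python
from typing import Callable, Iterable, Optional, Sequence


def find_sel(keys: Sequence[int],
             h2: Callable[[int, int], int],
             candidates: Iterable[int]) -> Optional[int]:
    """Return the first SEL for which h2 is collision-free on keys."""
    for sel in candidates:
        slots = {h2(k, sel) for k in keys}
        if len(slots) == len(keys):    # every key landed in its own slot
            return sel
    return None                        # insert failure: no SEL works


def toy_h2(key: int, sel: int) -> int:
    # Hypothetical family with four output slots; SEL picks two key bits.
    return (key >> sel) & 0b11


print(find_sel([10, 18], toy_h2, range(6)))          # a usable SEL exists
print(find_sel([0, 1, 2, 3, 4], toy_h2, range(6)))   # five keys, four slots
```

The second call must fail because five keys cannot occupy four slots without a collision; in that situation a larger output range would have to be tried.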

The apparatus and methods described above can provide 
searches into a data structure with key values to retrieve 
corresponding associated data within a predetermined num- 
ber of steps. The deterministic search time allows for 
pipelining to achieve highest search throughput. For 
example, such an application may allow for high-throughput
processing of packet data. A packet processing database can
be an associative memory, and packet header fields can be
applied as keys to the associative memory.
It is understood that while the various embodiments
describe systems that can include a first storage device that 
may include pointers to a second storage device, alternate 
approaches may include further levels of hashing. That is, as 
just one example, a first storage device may have pointers to 
a second storage device which may itself have pointers to a 
third storage device, and so on. 

While the preferred embodiments set forth herein have 
been described in detail, it is understood that the present 
invention could be subject to various changes, substitutions, 
and alterations without departing from the spirit and scope 
of the invention. Accordingly, the present invention is 
intended to be limited only as defined by the appended 
claims. 

What is claimed is: 

1. A system, comprising: 

a first hash function calculator that hashes a search key 
value into a first output value; 

a first store accessed by the first output value; 

a second hash function calculator that hashes the search 
key value according to hashing criteria stored in a first 
store entry accessed by the first output value, the search
key value being hashed into a second output value; and

a second store accessed according to the second output 
value and at least one value stored in the first store 
entry. 

2. The system of claim 1, wherein: 

the first hash function calculator includes a Galois field 
multiplier. 

3. The system of claim 1, wherein: 

the first hash function calculator includes a Galois field 
divider. 

4. The system of claim 1, wherein: 

the first store includes leaf entries having data associated 
with the key value of the leaf entry. 

5. The system of claim 1, wherein: 

the first store includes pointer entries that include hashing 
criteria and a chunk base address. 

6. The system of claim 1, wherein: 

the hashing criteria includes a value that selects a par- 
ticular hash function from a number of hash functions. 

7. The system of claim 1, wherein: 

the second hash function calculator includes a Galois field 
multiplier and the hashing criteria include a multipli- 
cand for the Galois field multiplier. 

8. The system of claim 1, wherein the second hash
function calculator includes: 

a Galois field multiplier having an output coupled to a
plurality of modulo circuits; and

a multiplexer for selecting an output of one modulo circuit
according to the hashing criteria.

9. The system of claim 1, wherein the search key value is 
from a domain of size Dl, the first output value is from a 
range of size R1<D1, and the second output value is from a 
range of size R2<D1. 

10. The system of claim 1, wherein the second output 
value comprises a collision-free hash value. 

11. The system of claim 1, wherein: 

the hashing criteria includes a value that determines a 
width of the second output value. 

12. The system of claim 11, wherein: 

the width of the second output value is from 2^ to 2^. 

13. An associative memory system for receiving an input 
key value, the system comprising: 

a first memory accessed by a first hash function calcula- 
tor; 
a second memory accessed by a second hash function
calculator; 

the first hash function calculator receiving key values and 
providing first output values according to a first hash 
function;

the second hash function calculator generating a second 
output value based on key values and pointer data from 
the first memory, the pointer data corresponding to at 
least two key values and including parameter data and 
a chunk base address; and 

a combiner combining the second output value with the 
chunk base address to generate a second memory
address. 

14. The system of claim 13, wherein: 

the first memory includes a random access memory. 

15. The system of claim 13, wherein: 

the second memory includes a random access memory. 

16. The system of claim 13, wherein: 

the first hash function includes the operation

H1(K)=((K*A+B)mod P)mod Q

where * is Galois field multiplication, + is Galois field
addition, mod is a Galois field modulo operation, P and
Q are irreducible polynomials over a Galois field, K is
a key value, and A and B are constants.

17. The system of claim 13, wherein: 

the second hash function calculator includes the operation

H2(K,p,q,i)=[(K*p+B_i)mod P]mod Q_q

where * is Galois field multiplication, + is Galois field
addition, mod is a modulo operation, P and Q_q are
irreducible polynomials over a Galois field, with the
degree of Q_q depending upon a value q, q varying
according to parameter data, K is a key value, p can
vary according to parameter data, and B_i is a constant
depending on a value i, i varying according to param-
eter data.

18. The system of claim 13, wherein: 

the first output values are first memory addresses. 
19. The system of claim 13, wherein the second output
value comprises a collision-free hash value.

20. A method of implementing an associative memory,
comprising the steps of:

hashing an input key to generate a first output value;

accessing a first store entry according to the first output
value;

if the accessed first store entry is not a chunk pointer entry, 
comparing against stored key data, and either reporting 
a null result, or retrieving and delivering data; and 

if the accessed first store entry is a chunk pointer entry, 
retrieving chunk pointer data, hashing the input key 
according to the chunk pointer data to generate a 
second output value, and accessing a second store entry
according to the second output value. 

21. The method of claim 20, wherein: 

the chunk pointer data includes parameter data and chunk
base data; and

the step of hashing the input key according to the chunk
pointer data includes:

hashing the input key with a hashing function deter-
mined by the parameter data to generate a hash
output value; and 

combining the hash output value and the chunk base
data to generate the second output value. 

22. The method of claim 20, wherein: 

each chunk pointer designates the start of a chunk, each
chunk including x entries; and

the step of hashing the input key according to the chunk
pointer data includes hashing the input key into an
output space x.

23. The method of claim 20, wherein: 

each chunk pointer designates the start of a chunk, each
chunk including x entries; and

the step of hashing the input key according to the chunk
pointer data includes hashing the input key into an
output space x.

24. The method of claim 20, wherein the second output 
value comprises a collision-free hash value. 
